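We propose a data collection and annotation pipeline that extracts information from Vietnamese radiology reports to provide accurate labels for chest X-ray (CXR) images. The resulting annotations match the locally endemic diagnosis categories, which may vary from country to country. To assess the efficacy of the proposed labeling technique, we built a CXR dataset containing 9,752 studies and evaluated our pipeline on a subset of this dataset. With an F1-score of at least 0.9923, the evaluation demonstrates that our labeling tool performs precisely and consistently across all classes. After building the dataset, we train deep learning models that leverage knowledge transferred from large public CXR datasets. We employ a variety of loss functions to overcome the curse of imbalanced multi-label datasets and experiment with various model architectures to select the one that delivers the best performance. Our best model (CheXpert-pretrained EfficientNet-B2) yields an F1-score of 0.6989 (95% CI 0.6740, 0.7240), an AUC of 0.7912, a sensitivity of 0.7064, and a specificity of 0.8760 for the abnormal diagnosis in general. Finally, we demonstrate that our coarse classification (based on five specific locations of abnormalities) yields results comparable to fine-grained classification (twelve pathologies) on the benchmark CheXpert dataset for general anomaly detection, while delivering better average performance across all classes.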
translated by 谷歌翻译
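The F1-score, sensitivity, and specificity quoted in the abstract above are all derived from the same four confusion-matrix counts. The following is a minimal sketch of those standard definitions for a single binary label; the function name `binary_metrics` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and F1 for one binary label.

    Hypothetical helper for illustration; inputs are sequences of 0/1.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against empty denominators so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Toy example (made-up labels, not data from the paper).
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
print(metrics)
```

For a multi-label dataset, one plausible approach is to call such a function once per class and average the results, though the abstract does not state which averaging scheme the authors used.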
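Since BERT (Devlin et al., 2018), learning contextualized word embeddings has been a de facto standard in NLP. However, progress on learning contextualized phrase embeddings is hindered by the lack of a human-annotated, phrase-in-context benchmark. To fill this gap, we propose PiC, a dataset of ~28K noun phrases accompanied by their contextual Wikipedia pages, together with a suite of three tasks of increasing difficulty for evaluating the quality of phrase embeddings. We find that training on our dataset improves the accuracy of ranking models and remarkably pushes Question Answering (QA) models to near-human accuracy, which is 95% Exact Match (EM), on semantic search given a query phrase and a passage. Interestingly, we find evidence that this impressive performance arises because the QA models learn to better capture the common meaning of a phrase regardless of its actual context. That is, on our Phrase Sense Disambiguation (PSD) task, the accuracy of SotA models drops substantially (60% EM), as they fail to differentiate between two different senses of the same phrase in two different contexts. Further results on our 3-task PiC benchmark reveal that learning contextualized phrase embeddings remains an interesting, open challenge.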
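A principle behind dozens of attribution methods is to take the difference in prediction before and after an input feature (here, a token) is removed as that feature's attribution. The popular Input Marginalization (IM) method (Kim et al., 2020) uses BERT to replace a token, yielding more plausible counterfactuals. While Kim et al. (2020) reported that IM is effective, we find this conclusion unconvincing because the DeletionBERT metric used in their paper is biased towards IM. Importantly, this bias exists in Deletion-based metrics, including Insertion, Sufficiency, and Comprehensiveness. Furthermore, our rigorous evaluation using 6 metrics and 3 datasets finds no evidence that IM is better than a Leave-One-Out (LOO) baseline. We find two reasons why IM is not better than LOO: (1) deleting a single word from the input only marginally reduces a classifier's accuracy; and (2) a highly predictable word is always given near-zero attribution, regardless of its true importance to the classifier. In contrast, making LIME samples more natural via BERT consistently improves LIME accuracy under several ROAR metrics.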
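The Leave-One-Out baseline mentioned above can be sketched in a few lines: the attribution of each token is the model's prediction on the full input minus its prediction with that token deleted. This is a minimal illustration only; `toy_predict_proba` is a hypothetical keyword-based stand-in for a real classifier, and none of the names below come from the paper.

```python
import math


def leave_one_out_attributions(tokens, predict_proba):
    """Attribution of token i = p(full input) - p(input without token i)."""
    base = predict_proba(tokens)
    return [
        base - predict_proba(tokens[:i] + tokens[i + 1:])
        for i in range(len(tokens))
    ]


def toy_predict_proba(tokens):
    """Hypothetical classifier: sigmoid of (#positive words - #negative words)."""
    positive = {"great", "good"}
    negative = {"bad", "boring"}
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))


tokens = ["a", "great", "movie"]
attributions = leave_one_out_attributions(tokens, toy_predict_proba)
for token, value in zip(tokens, attributions):
    print(f"{token}: {value:.3f}")
```

Input Marginalization differs in that it replaces the token with BERT-proposed alternatives rather than deleting it; that variant needs a masked language model and is not shown here.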
The introduction of high-quality image generation models, particularly the StyleGAN family, provides a powerful tool to synthesize and manipulate images. However, existing models are built upon high-quality (HQ) data as desired outputs, making them unfit for in-the-wild low-quality (LQ) images, which are common inputs for manipulation. In this work, we bridge this gap by proposing a novel GAN structure that allows for generating images with controllable quality. The network can synthesize various image degradation and restore the sharp image via a quality control code. Our proposed QC-StyleGAN can directly edit LQ images without altering their quality by applying GAN inversion and manipulation techniques. It also provides for free an image restoration solution that can handle various degradations, including noise, blur, compression artifacts, and their mixtures. Finally, we demonstrate numerous other applications such as image degradation synthesis, transfer, and interpolation.
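Existing state-of-the-art 3D point cloud instance segmentation methods rely on a grouping-based approach that groups points to obtain object instances. Despite improvements in segmentation accuracy, these methods lack scalability and commonly require dividing a large input into multiple parts. To process a scene with millions of points, the fastest existing method, SoftGroup \cite{vu2022softgroup}, requires tens of seconds, which is unsatisfactory. Our finding is that $k$-Nearest Neighbor ($k$-NN) search, which serves as the prerequisite of grouping, is a computational bottleneck. This bottleneck severely worsens the inference time in scenes with a large number of points. This paper proposes SoftGroup++ to address this computational bottleneck and further optimize the inference speed of the whole network. SoftGroup++ is built upon SoftGroup and differs from it in three important aspects: (1) it performs octree $k$-NN instead of vanilla $k$-NN to reduce time complexity from $\mathcal{O}(n^2)$ to $\mathcal{O}(n \log n)$, (2) it performs pyramid scaling that adaptively downsamples backbone outputs to reduce the search space for $k$-NN and grouping, and (3) it performs late devoxelization that delays the conversion from voxels to points towards the end of the model, so that intermediate components operate at a low computational cost. Extensive experiments on various indoor and outdoor datasets demonstrate the efficacy of the proposed SoftGroup++. Notably, SoftGroup++ processes large scenes of millions of points in a single forward pass without dividing the input into multiple parts, thus enriching contextual information. In particular, SoftGroup++ achieves a 2.4-point AP$_{50}$ improvement while being $6\times$ faster. Code and trained models will be made publicly available.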
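To make the $\mathcal{O}(n^2)$ cost of vanilla $k$-NN concrete, the sketch below compares every point against every other point, which is the quadratic step the abstract identifies as the bottleneck. It is an illustrative pure-Python baseline under assumed names (`brute_force_knn`); it is not the SoftGroup++ implementation, and the octree variant is not shown.

```python
import heapq


def brute_force_knn(points, k):
    """Return, for each point, the indices of its k nearest other points.

    Every point is compared against every other point, so the number of
    distance evaluations grows as n * (n - 1), i.e. quadratically in n.
    """
    neighbors = []
    for i, p in enumerate(points):
        # Squared Euclidean distance to all other points (no sqrt needed
        # because only the ordering matters).
        candidates = (
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        nearest = heapq.nsmallest(k, candidates)
        neighbors.append([j for _, j in nearest])
    return neighbors


# Toy 3D points (made up for illustration).
points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
print(brute_force_knn(points, k=2))
```

A spatial index such as an octree avoids most of these comparisons by only visiting cells near the query point, which is where the $\mathcal{O}(n \log n)$ figure in the abstract comes from.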
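Deep learning has been successfully applied to solve a variety of complex problems, ranging from big data analytics to computer vision and human-level control. However, advances in deep learning have also been used to create software that can threaten privacy, democracy, and national security. One such deep learning-powered application that has recently emerged is the deepfake. Deepfake algorithms can create fake images and videos that humans cannot distinguish from authentic ones. Technologies that can automatically detect and assess the integrity of digital visual media are therefore indispensable. This paper presents a survey of algorithms used to create deepfakes and, more importantly, of methods proposed in the literature to date for detecting them. We present extensive discussions of the challenges, research trends, and directions related to deepfake technologies. By reviewing the background of deepfakes and state-of-the-art deepfake detection methods, this study provides a comprehensive overview of deepfake techniques and facilitates the development of new and more robust methods to deal with increasingly challenging deepfakes.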
The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
We present the interpretable meta neural ordinary differential equation (iMODE) method to rapidly learn generalizable (i.e., not parameter-specific) dynamics from trajectories of multiple dynamical systems that vary in their physical parameters. The iMODE method learns meta-knowledge, the functional variations of the force field of dynamical system instances without knowing the physical parameters, by adopting a bi-level optimization framework: an outer level capturing the common force field form among studied dynamical system instances and an inner level adapting to individual system instances. A priori physical knowledge can be conveniently embedded in the neural network architecture as inductive bias, such as conservative force field and Euclidean symmetry. With the learned meta-knowledge, iMODE can model an unseen system within seconds, inversely reveal knowledge on the physical parameters of a system, or serve as a Neural Gauge to "measure" the physical parameters of an unseen system from observed trajectories. We test the validity of the iMODE method on bistable, double pendulum, Van der Pol, Slinky, and reaction-diffusion systems.
Unmanned aerial vehicle (UAV) swarms are considered a promising technique for next-generation communication networks due to their flexibility, mobility, low cost, and ability to provide services collaboratively and autonomously. Distributed learning (DL) enables UAV swarms to intelligently provide communication services, multi-directional remote surveillance, and target tracking. In this survey, we first introduce several popular DL algorithms such as federated learning (FL), multi-agent reinforcement learning (MARL), distributed inference, and split learning, and present a comprehensive overview of their applications for UAV swarms, such as trajectory design, power control, wireless resource allocation, user assignment, perception, and satellite communications. Then, we present several state-of-the-art applications of UAV swarms in wireless communication systems, such as reconfigurable intelligent surfaces (RIS), virtual reality (VR), and semantic communications, and discuss the problems and challenges that DL-enabled UAV swarms can solve in these applications. Finally, we describe open problems of using DL in UAV swarms and future research directions for DL-enabled UAV swarms. In summary, this survey provides a comprehensive overview of various DL applications for UAV swarms in extensive scenarios.
While the brain connectivity network can inform the understanding and diagnosis of developmental dyslexia, its cause-effect relationships have not yet been sufficiently examined. Employing electroencephalography signals and a band-limited white noise stimulus at 4.8 Hz (the prosodic-syllabic frequency), we measure the phase Granger causalities among channels to identify differences between dyslexic learners and controls, thereby proposing a method to calculate directional connectivity. As causal relationships run in both directions, we explore three scenarios, namely channels' activity as sources, as sinks, and in total. Our proposed method can be used for both classification and exploratory analysis. In all scenarios, we find confirmation of the established right-lateralized Theta sampling network anomaly, in line with the temporal sampling framework's assumption of oscillatory differences in the Theta and Gamma bands. Further, we show that this anomaly primarily occurs in the causal relationships of channels acting as sinks, where it is significantly more pronounced than when only total activity is observed. In the sink scenario, our classifier obtains 0.84 and 0.88 accuracy and 0.87 and 0.93 AUC for the Theta and Gamma bands, respectively.